An Efficient and Sensitive Decision Tree Approach to Mining Concept-Drifting Data Streams

نویسندگان

  • Cheng-Jung Tsai
  • Chien-I Lee
  • Wei-Pang Yang
چکیده

Data stream mining has become a novel research topic of growing interest in knowledge discovery. Most proposed algorithms for data stream mining assume that each data block is basically a random sample from a stationary distribution, but many databases available violate this assumption. That is, the class of an instance may change over time, known as concept drift. In this paper, we propose a Sensitive Concept Drift Probing Decision Tree algorithm (SCRIPT), which is based on the statistical X test, to handle the concept drift problem on data streams. Compared with the proposed methods, the advantages of SCRIPT include: a) it can avoid unnecessary system cost for stable data streams; b) it can immediately and efficiently corrects original classifier while data streams are instable; c) it is more suitable to the applications in which a sensitive detection of concept drift is required.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cost-Sensitive Perceptron Decision Trees for Imbalanced Drifting Data Streams

Mining streaming and drifting data is among the most popular contemporary applications of machine learning methods. Due to the potentially unbounded number of instances arriving rapidly, evolving concepts and limitations imposed on utilized computational resources, there is a need to develop efficient and adaptive algorithms that can handle such problems. These learning difficulties can be furt...

متن کامل

Knowledge Maintenance on Data Streams with Concept Drifting

Concept drifting in data streams often occurs unpredictably at any time. Currently many classification mining algorithms deal with this problem by using an incremental learning approach or ensemble classifiers approach. However, both of them can not make a prediction at any time exactly. In this paper, we propose a novel strategy for the maintenance of knowledge. Our approach stores and maintai...

متن کامل

StreamMiner: A Classifier Ensemble-based Engine to Mine Concept-drifting Data Streams

We demonstrate StreamMiner, a random decision-tree ensemble based engine to mine data streams. A fundamental challenge in data stream mining applications (e.g., credit card transaction authorization, security buysell transaction, and phone call records, etc) is concept-drift or the discrepancy between the previously learned model and the true model in the new data. The basic problem is the abil...

متن کامل

An adaptive ensemble classifier for mining concept drifting data streams

Traditional data mining techniques cannot be directly applied to the real-time data streaming environment. Existing mining classifiers therefore need to be updated frequently to adopt the changes in data streams. In this paper, we address this issue and propose an adaptive ensemble approach for classification and novel class detection in concept-drifting data streams. The proposed approach uses...

متن کامل

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Informatica, Lith. Acad. Sci.

دوره 19  شماره 

صفحات  -

تاریخ انتشار 2008